Hao Wen

GRIP-VLM: Group-Relative Importance Pruning for Efficient Vision-Language Models

May 13, 2026
Via arXiv

EmbodiSkill: Skill-Aware Reflection for Self-Evolving Embodied Agents

May 11, 2026
Via arXiv

OMGs: A multi-agent system supporting MDT decision-making across the ovarian tumour care continuum

Feb 14, 2026
Via arXiv

Entropy-Guided Data-Efficient Training for Multimodal Reasoning Reward Models

Feb 02, 2026
Via arXiv

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Dec 30, 2025
Via arXiv

From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition

Dec 18, 2025
Via arXiv

AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management

Dec 11, 2025
Via arXiv

How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline

Dec 08, 2025
Via arXiv

BudgetThinker: Empowering Budget-aware LLM Reasoning with Control Tokens

Aug 24, 2025
Via arXiv

DyCrowd: Towards Dynamic Crowd Reconstruction from a Large-scene Video

Aug 18, 2025
Via arXiv